
feat: Support multiple MOFED DS #691

Merged · rollandf merged 1 commit into Mellanox:master on Mar 13, 2024

Conversation

rollandf
Member

Mofed driver precompiled container images
are compiled using a specific Kernel.

As a result, the Mofed Driver DaemonSet should
have the Kernel as part of the Node Selector.

In addition, since there can be Nodes with different Kernel versions, a DaemonSet for each existing Kernel in the cluster is created.
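The approach can be sketched roughly as follows; the NFD kernel label and the FNV-based name suffix here are illustrative assumptions, not the operator's exact code:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Hypothetical sketch: one DaemonSet per kernel version found in the cluster.
// The label key and naming scheme are illustrative only.

// nodeSelectorFor returns the selector a per-kernel MOFED DaemonSet would use,
// pinning pods to nodes running the kernel the driver was precompiled for.
func nodeSelectorFor(kernel string) map[string]string {
	return map[string]string{
		"feature.node.kubernetes.io/kernel-version.full": kernel,
	}
}

// dsNameFor derives a distinct, name-safe DaemonSet name per kernel by
// hashing the kernel string (mirroring the FNV-based hashing discussed
// later in this PR).
func dsNameFor(kernel string) string {
	h := fnv.New32a()
	h.Write([]byte(kernel))
	return fmt.Sprintf("mofed-ds-%d", h.Sum32())
}

func main() {
	kernels := []string{"5.14.0-362.el9.x86_64", "5.15.0-91-generic"}
	for _, k := range kernels {
		fmt.Println(dsNameFor(k), nodeSelectorFor(k))
	}
}
```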

@rollandf rollandf added the on hold This enhancement is currently on hold pending additional clarification and evaluation label Nov 22, 2023
@rollandf
Member Author

/retest-all

@rollandf rollandf force-pushed the nodepool branch 7 times, most recently from e765759 to f1602c3 Compare December 18, 2023 17:03
@rollandf
Member Author

/retest-all

@rollandf rollandf force-pushed the nodepool branch 2 times, most recently from 329b06a to 57a1c3d Compare December 31, 2023 20:17
@rollandf
Member Author

rollandf commented Jan 1, 2024

/retest-all

@rollandf
Member Author

rollandf commented Jan 1, 2024

/retest-blackduck_scan

@rollandf
Member Author

rollandf commented Jan 1, 2024

/retest-blackduck_scan

@rollandf rollandf force-pushed the nodepool branch 2 times, most recently from f1501b6 to 7556451 Compare January 3, 2024 21:13
@rollandf
Member Author

rollandf commented Jan 3, 2024

Depends on NVIDIA/k8s-operator-libs#8

@rollandf rollandf force-pushed the nodepool branch 2 times, most recently from b3d392c to bf02b16 Compare January 11, 2024 08:51
@rollandf rollandf force-pushed the nodepool branch 2 times, most recently from ff1ffc2 to b635fad Compare March 3, 2024 09:44
@rollandf rollandf removed the on hold This enhancement is currently on hold pending additional clarification and evaluation label Mar 3, 2024
Contributor

@killianmuldoon killianmuldoon left a comment


General question - in the default case for someone updating to Network Operator 24.04 this will cause a full rollout of the DS across all nodes, even if the version of OFED etc. remains the same. The only way to avoid this AFAICT is to opt out of migration using an environment variable.

Is this the expected upgrade path for users in general? i.e. a network operator update requires a rollout of OFED - meaning nodes are offline for a non-trivial period?

pkg/nodeinfo/attributes.go (outdated, resolved)
pkg/nodeinfo/node_info.go (resolved)
Comment on lines 88 to 94
func GetStringHash(s string) string {
hasher := fnv.New32a()
if _, err := hasher.Write([]byte(s)); err != nil {
panic(err)
}
return rand.SafeEncodeString(fmt.Sprint(hasher.Sum32()))
}
Contributor

We should probably just keep this private and beside where it's used unless there's a reason to have it in this Utils package

Member Author

Agreed, moved to state package

pkg/utils/utils_test.go (outdated, resolved)
go.mod (outdated, resolved)
Comment on lines -170 to 179
-	if err := migrate.Migrate(stopCtx, setupLog.WithName("migrate"), directClient); err != nil {
-		setupLog.Error(err, "failed to run migration logic")
-		os.Exit(1)
+	migrationCompletionChan := make(chan struct{})
+	m := migrate.Migrator{
+		K8sClient:      directClient,
+		MigrationCh:    migrationCompletionChan,
+		LeaderElection: enableLeaderElection,
+		Logger:         ctrl.Log.WithName("Migrator"),
+	}
Contributor

What's the reason for moving this logic to a runnable?

Member Author

We want to use the leader election mechanism and make sure migration runs only after we win the leader election, by adding the Migrator via mgr.Add(&m), which requires a Runnable.
Otherwise, an old instance of the operator could recreate the DS after the new one deleted it.
In Helm, we have a pre-hook that scales down the deployment, but we won't have it in OpenShift.

Contributor

Would it be possible to deprecate the helm hooks in favor of this approach?

Member Author

I think one thing we would like before removing this hook is to stop creating the NicClusterPolicy instance as part of the Helm chart.

The process should be:

  • upgrade operator
  • wait for operator to be running
  • update NCP with new values

Contributor

Let's try to get this prioritized IMO; it makes a lot of things simpler for deploying the network operator.

@rollandf
Member Author

rollandf commented Mar 4, 2024

General question - in the default case for someone updating to Network Operator 24.04 this will cause a full rollout of the DS across all nodes, even if the version of OFED etc. remains the same. The only way to avoid this AFAICT is to opt out of migration using an environment variable.

Is this the expected upgrade path for users in general? i.e. a network operator update requires a rollout of OFED - meaning nodes are offline for a non-trivial period?

Usually, a Network Operator release comes with a new MOFED, meaning a rollout of nodes during the upgrade.
As a one-time cost when this feature ships, the upgrade will still happen even if the user keeps the same MOFED version; we could not find another option.
The env variable is not a real option for end users; it is just an escape hatch if things go wrong. We want all users to migrate to the multiple DS to support precompiled MOFED.

Contributor

@killianmuldoon killianmuldoon left a comment

A couple of smaller comments - but looks generally good to me

@@ -504,7 +516,7 @@ func (s *stateOFED) getInitContainerConfig(
// getMofedDriverImageName generates MOFED driver image name based on the driver version specified in CR
// TODO(adrianc): in Network-Operator v1.5.0, we should just use the new naming scheme
Contributor

Can we remove this TODO? Not sure exactly what it's referring to.

Member Author

Done

reqLogger.V(consts.LogLevelInfo).Info("No nodes with Mellanox NICs where found in the cluster.")
return []*unstructured.Unstructured{}, nil
}
objs := make([]*unstructured.Unstructured, 0)
renderedObjsMap := stateObjects{}
for _, np := range nodePools {
Contributor

Could we break what's inside this loop out into a separate function?

Member Author

Done

for _, np := range nodePools {
nodePool := np

useDtk := clusterInfo.IsOpenshift() && config.FromEnv().State.OFEDState.UseDTK
Contributor

I think some of the actions in this loop - setting defaults on the NCP, checking if we useDTK, certconfig and repo config can be handled outside this loop and just passed into render the manifest when we need it. WDYT?

Member Author

Done.
certconfig and repo config need to stay in the loop: they need osname, which in theory can differ between pools.

Comment on lines +69 to +74
select {
case <-r.MigrationCh:
case <-ctx.Done():
return ctrl.Result{}, fmt.Errorf("canceled")
}
Contributor

We should probably add this to all controllers, even if not strictly needed right now. It might even be worth doing a reconcile-wrapper, but I think this snippet as-is in all controllers is good enough for now. Maybe also add a comment to give folks clues as to what we're waiting for at this point.

Member Author

done

Collaborator

nit: it would be good to make it in a separate commit

Collaborator

@ykulazhenkov ykulazhenkov left a comment

I checked the PR. It looks good. LGTM after rebase and addressing Killian's comments.

Collaborator

@e0ne e0ne left a comment

LGTM once it is rebased

@rollandf rollandf force-pushed the nodepool branch 2 times, most recently from cde2088 to 7d70631 Compare March 7, 2024 12:31
ykulazhenkov
ykulazhenkov previously approved these changes Mar 11, 2024
Contributor

@killianmuldoon killianmuldoon left a comment

Just some nits around logging, maybe worth taking as a follow-up. One log message is wrong, and elsewhere we shouldn't be logging when we're returning the error.

If we want to add context, we should wrap the error but let controller-runtime do the logging, so it's not confusing and duplicated for users.

@@ -55,6 +84,11 @@ func migrate(ctx context.Context, log logr.Logger, c client.Client) error {
log.V(consts.LogLevelError).Error(err, "error trying to remove state label on NV IPAM configmap")
return err
}
if err := handleSingleMofedDS(ctx, log, c); err != nil {
// critical for the operator operation, will fail Mofed migration
log.V(consts.LogLevelError).Error(err, "error trying to remove state label on NV IPAM configmap")
Contributor

This error message should be updated

Member Author

Done

log.V(consts.LogLevelDebug).Info("NIC ClusterPolicy not found, skip handling single MOFED DS")
return nil
} else if err != nil {
log.V(consts.LogLevelError).Error(err, "fail to get NIC ClusterPolicy")
Contributor

No need to log if we're returning the error here AFAIK.

Member Author

Done

Mofed driver precompiled container images
are compiled using a specific Kernel.

As a result, the Mofed Driver DaemonSet should
have the Kernel as part of the Node Selector.

In addition, since there can be Nodes with different
Kernel versions, a DaemonSet for each existing Kernel
in the cluster is created.

In the Migration module, the former DS is deleted
with DeletePropagationOrphan so that MOFED pods
still exist until a manual or auto upgrade is done.

Signed-off-by: Fred Rolland <[email protected]>
Contributor

@killianmuldoon killianmuldoon left a comment

/lgtm

/approve

Collaborator

@ykulazhenkov ykulazhenkov left a comment

LGTM

@rollandf rollandf merged commit 7193b1e into Mellanox:master Mar 13, 2024
16 checks passed
@rollandf rollandf deleted the nodepool branch August 27, 2024 08:55